Journal of Systems Engineering and Electronics ›› 2011, Vol. 33 ›› Issue (1): 202-207. doi: 10.3969/j.issn.1001-506X.2011.01.41

• Software, Algorithms and Simulation •

Study of abnormal data detecting method using attribute correlation analysis

LIU Bo (刘波), PAN Jiu-hui (潘久辉)

  1. Department of Computer Science, College of Information Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
  • Online: 2011-01-20 Published: 2010-01-03

Abstract:

To discover abnormal data in a database, a new concept, the correlated confidence between two data itemsets, is proposed, and an algorithm that computes abnormal-data detection rules based on this measure is studied. The resulting rules are suited to detecting outliers in discrete attributes. When the rules are computed, the minimum correlated-confidence threshold does not need to be specified by the user; it is determined from the frequencies of 1-itemsets. The properties of correlated confidence can also be used to reduce the time complexity of the rule-computing algorithm. Experimental results show that detecting abnormal data with the correlation rules computed by this method is efficient and achieves high precision and recall.
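The abstract does not give the definition of correlated confidence or the exact threshold formula, so the following is only a minimal sketch of the general idea of rule-based outlier detection on discrete attributes. It substitutes the standard association-rule confidence for the paper's correlated confidence, and it assumes, purely for illustration, that the threshold is the largest 1-itemset frequency. All function names and the sample records are hypothetical.

```python
from collections import Counter
from itertools import permutations


def one_itemset_freq(records):
    """Relative frequency of every (attribute, value) pair, i.e. every 1-itemset."""
    counts = Counter(item for rec in records for item in rec.items())
    return {item: n / len(records) for item, n in counts.items()}


def confidence(records, antecedent, consequent):
    """Standard rule confidence (a stand-in, NOT the paper's correlated confidence):
    share of records containing `antecedent` that also contain `consequent`."""
    (a_attr, a_val), (c_attr, c_val) = antecedent, consequent
    matching = [r for r in records if r.get(a_attr) == a_val]
    if not matching:
        return 0.0
    return sum(r.get(c_attr) == c_val for r in matching) / len(matching)


def detection_rules(records):
    """Keep rules between items of different attributes whose confidence exceeds
    a threshold derived from 1-itemset frequencies (illustrative assumption:
    the largest 1-itemset frequency)."""
    freq = one_itemset_freq(records)
    threshold = max(freq.values())
    return [
        (a, c)
        for a, c in permutations(freq, 2)
        if a[0] != c[0] and confidence(records, a, c) > threshold
    ]


def flag_outliers(records, rules):
    """Indices of records that match a rule's antecedent but not its consequent."""
    flagged = []
    for i, rec in enumerate(records):
        for (a_attr, a_val), (c_attr, c_val) in rules:
            if rec.get(a_attr) == a_val and rec.get(c_attr) != c_val:
                flagged.append(i)
                break
    return flagged


def precision_recall(flagged, true_outliers):
    """Precision and recall of the flagged indices against known outliers."""
    flagged, true_outliers = set(flagged), set(true_outliers)
    hits = len(flagged & true_outliers)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(true_outliers) if true_outliers else 0.0
    return precision, recall


# Hypothetical data: record 5 pairs a city with an unexpected province.
records = (
    [{"city": "Guangzhou", "province": "Guangdong"}] * 5
    + [{"city": "Guangzhou", "province": "Hunan"}]
    + [{"city": "Changsha", "province": "Hunan"}] * 4
)
rules = detection_rules(records)
flagged = flag_outliers(records, rules)
print(flagged, precision_recall(flagged, [5]))
```

In this toy data set the only record that violates a retained rule is the mismatched one at index 5. A real implementation would replace `confidence` and the threshold rule with the definitions given in the paper.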